test: add integration tests for "pebble run" #497

IronCore864 · 2024-09-04T05:12:36Z

A PoC for the Pebble integration tests; see the issue here.

Two issues I want to call out:

It seems the test suite can't be used with build tags; it can't collect any tests: testing: warning: no tests to run.

Maybe this is possible, but I haven't made it work yet. One possible reason: per the Go documentation, a build tag lists the conditions under which a file should be included in the package, which may be why it doesn't work with the suite.

This means we can't do something like func (s *IntegrationSuite) TestXXX(c *C), but can only do func TestXXX(t *testing.T), which I think is OK; it's just worth mentioning.

In the PoC I built the binary in setup, then created some functions to help create layers, run pebble and return logs. The logs part is tricky:

The daemon is a long-running process and we can't wait for it to "finish". Here I used something like "if there are no new logs in the past second, kill the process and return". It doesn't feel ideal to me.

I had another draft where I simply passed in a timeout value, slept for that long, then killed the process.

Both worked, but I'm not sure if this is the best we can do here.
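For illustration, the "no new logs in the past second" idea could look roughly like the sketch below. It assumes the daemon's output has already been split into lines and sent on a channel; collectUntilIdle and its parameters are hypothetical names, not the code in this PR. The caller would kill the process once it returns.

```go
package main

import "time"

// collectUntilIdle gathers lines from the channel and returns them once no
// new line has arrived for the idle duration, or once the channel is closed.
// Hypothetical helper for illustration only.
func collectUntilIdle(lines <-chan string, idle time.Duration) []string {
	var got []string
	for {
		select {
		case line, ok := <-lines:
			if !ok {
				// The producer closed the channel: nothing more to read.
				return got
			}
			got = append(got, line)
		case <-time.After(idle):
			// A fresh timer is created on every loop iteration, so this
			// fires only when a full idle window passes with no new line.
			return got
		}
	}
}
```

The fixed-timeout draft is the same loop with a single timer created before the loop instead of one per iteration.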

dimaqq

The approach overall makes sense, though I wonder: do build flags allow you to fudge the code under test somehow?

I can see how this allows running integration tests overall, but I'm a bit unclear about how this helps testing code that needs root or whatnot.

P.S. maybe get someone from Juju to review the Go bits?

internals/testintegration/pebble_run_test.go

internals/testintegration/utils.go

benhoyt

Leaving comments per our discussion.

internals/testintegration/pebble_another_test.go

internals/testintegration/pebble_run_test.go

internals/testintegration/utils.go

internals/testintegration/pebble_run_test.go

tests/README.md

tests/main_test.go

tests/utils.go

IronCore864 · 2024-09-11T07:42:30Z

I've resolved all the above comments. Here is a list of test cases for pebble run:

Pebble Run Tests (`run_test.go`)

1 `TestNormal`

services:
    svc1:
        override: replace
        command: /bin/sh -c "touch svc1; sleep 1000"
        startup: enabled
    svc2:
        override: replace
        command: /bin/sh -c "touch svc2; sleep 1000"
        startup: enabled

Start pebble.
Pebble will start svc1 and svc2
Check svc1 and svc2 are running: check files svc1 and svc2.

Need a helper function like waitForServices with a timeout to check files.
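A minimal sketch of such a helper, assuming simple polling with os.Stat. The name waitForFile, its signature, and the 10ms poll interval are assumptions for illustration; a real test helper would probably take a *testing.T and fail the test directly.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// waitForFile polls until path exists or the timeout elapses.
// Hypothetical helper for illustration only.
func waitForFile(path string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if _, err := os.Stat(path); err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for file %s", path)
		}
		// Short sleep between polls to avoid spinning.
		time.Sleep(10 * time.Millisecond)
	}
}
```

In TestNormal this would be called once per service, for example with the paths of the svc1 and svc2 marker files.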

2 `TestCreateDirs`

tmpDir := t.TempDir()
pebbleDir := filepath.Join(tmpDir, "PEBBLE_HOME")

Start pebble with --create-dirs.
Pebble will create dir tmpDir/PEBBLE_HOME
Check dir tmpDir/PEBBLE_HOME exists.

No need for a helper function, use os.Stat.

3 `TestHold`

services:
    svc1:
        override: replace
        command: /bin/sh -c "touch /home/ubuntu/PEBBLE_HOME/svc1; sleep 1000"
        startup: enabled

Start pebble with --hold.
Pebble daemon starts.
Wait for log "Started daemon." before issuing any check (the waitForLogs helper function is needed).
Sleep a second before checking services (an immediate check could pass even if svc1 were started shortly after the "Started daemon." log).
Check that svc1 is not running: check file svc1 does not exist. No need for a helper, use os.Stat.
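A rough sketch of what a waitForLogs-style helper could look like, assuming log lines arrive on a channel and carry a "[service]" prefix as in Pebble's log output. The name, signature, and error messages are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// waitForLog reads lines from logs until one from the given service
// contains text. It returns an error if the timeout elapses or the
// channel is closed first. Hypothetical helper for illustration only.
func waitForLog(logs <-chan string, service, text string, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case line, ok := <-logs:
			if !ok {
				return fmt.Errorf("log stream closed before %q appeared", text)
			}
			if strings.Contains(line, "["+service+"]") && strings.Contains(line, text) {
				return nil
			}
		case <-timer.C:
			return fmt.Errorf("timed out waiting for log: %s", text)
		}
	}
}
```

For TestHold, the test would wait for "Started daemon", sleep, and then use os.Stat to confirm that the svc1 file does not exist.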

4 `TestHttpPort`

Start pebble with --http=:4000.
Pebble will start on port 4000
Check that Pebble is listening on port 4000.

Need a helper function like func isPortInUseByProcess(port string, processName string) bool {}.

5 `TestVerbose`

services:
    svc1:
        override: replace
        command: /bin/sh -c "cat /home/ubuntu/PEBBLE_HOME/layers/001-layer.yaml; sleep 1000"
        startup: enabled

Start pebble with --verbose
Check "services:", "svc1:", "override: replace", "startup: enabled" are in the logs, need the waitForLogs helper func.

6 `TestArgs`

services:
    svc1:
        override: replace
        command: /bin/sh
        startup: enabled

Start pebble with --verbose --args svc1 -c "cat /home/ubuntu/PEBBLE_HOME/layers/001-layer.yaml; sleep 1000" (verbose is used so that I can test args, and I use the layers file for testing, have to be a bit innovative here to test args...)
Same checks as the previous test case, waitForLogs.

7 `TestIdentities`

Create a file named idents-add.yaml, maybe in pebbleDir:

identities:
    bob:
        access: admin
        local:
            user-id: 42
    alice:
        access: read
        local:
            user-id: 2000

Start pebble with --identities.
Run pebble identity bob command and check access: admin, user-id: 42 are in the output.
Run pebble identity alice command and check access: read, user-id: 2000 are in the output.

Need a helper like runPebbleCmdAndCheckOutput.

Summary

waitForLogs is needed
a few more helpers are needed, see the above test cases.

Since these test cases are relatively simple to implement, I will go ahead and do them now. We can review and refactor later.

IronCore864 · 2024-09-12T14:51:27Z

I have restructured the files and added the tests mentioned above.

Note that I didn't strictly follow the "rule of 3" when creating those helper functions, some are only used twice, some even just once. The reason is that if some code weren't put in a separate function, the test functions would become much longer and harder to read. So, I kept them as they were. Instead of thinking of them as helper functions, think of them as a way to refactor the tests to improve readability, and if they don't fit future needs, we can refactor them later.

Todo:

Add a GitHub Actions workflow to run these integration tests.
Use build flags to orchestrate root tests.

benhoyt

Thanks for this -- I think it's heading in the right direction. Lots of comments, but they're basically about how we structure the helpers and make the tests more obvious. I think in general it's better to have fewer, more generic helpers. One tell-tale sign is that some of the names start to get long or specific, like isPortUsedByProcess (very specific to a particular test), writeIdentitiesFile (doesn't actually do anything specific to identities), or runPebbleCmdAndCheckOutput (often an "and" in a function name means you should split it). Happy to discuss any of these further on video if you want.

tests/README.md

tests/main_test.go

tests/run_test.go

tests/main_test.go

benhoyt · 2024-09-15T23:48:47Z

> The reason is that if some code weren't put in a separate function, the test functions would become much longer and harder to read. So, I kept them as they were. Instead of thinking of them as helper functions, think of them as a way to refactor the tests to improve readability.

I realise this is somewhat subjective, but I think mere length is okay. I definitely disagree about "harder to read" -- I think several of the helpers obscure the logic of the test and make it unclear what it's actually testing. I think there are some minor tweaks we can make to the structure and names to help with this, and I've argued my case in the comments above.

benhoyt

FWIW, I just ran the integration tests with the pebble daemon running in another window, and got this error:

$ go test -count=1 -tags=integration ./tests/
--- FAIL: TestHttpPort (3.05s)
    run_test.go:113: Error waiting for logs: timed out waiting for log: Started daemon

It seems to me the two instances should be completely independent, as the integration tests point to a temporary PEBBLE directory. Any ideas why this would fail?

IronCore864 · 2024-09-18T14:03:40Z

> FWIW, I just ran the integration tests with the pebble daemon running in another window, and got this error:
>
> $ go test -count=1 -tags=integration ./tests/
> --- FAIL: TestHttpPort (3.05s)
>     run_test.go:113: Error waiting for logs: timed out waiting for log: Started daemon
>
> It seems to me the two instances should be completely independent, as the integration tests point to a temporary PEBBLE directory. Any ideas why this would fail?

I only saw this comment after addressing all the comments above, and after those fixes I could not reproduce it. Could you confirm?

benhoyt · 2024-09-18T23:10:35Z

Regarding the TestHttpPort failure when the daemon is running -- no, I can't reproduce now either. We'll call it fixed.

benhoyt

Nice, this is much cleaner now. A few minor comments, and one comment about the nested goroutines in pebbleRun.

.github/workflows/integration-test.yml

tests/README.md

tests/run_test.go

tests/main_test.go

tests/run_test.go

benhoyt

Looking good, thanks!

One remaining thing I noticed after running the integration tests -- they don't seem to clean up after themselves very well. After running them, I get:

$ ps aux | grep sleep
ben       137333  0.0  0.0   2800  1536 pts/2    S    08:56   0:00 /bin/sh -c touch /tmp/TestStartupEnabledServices3374203583/001/svc1; sleep 1000
ben       137334  0.0  0.0   2800  1536 pts/2    S    08:56   0:00 /bin/sh -c touch /tmp/TestStartupEnabledServices3374203583/001/svc2; sleep 1000
ben       137337  0.0  0.0   8288  1920 pts/2    S    08:56   0:00 sleep 1000
ben       137338  0.0  0.0   8288  1920 pts/2    S    08:56   0:00 sleep 1000
ben       137373  0.0  0.0   2800  1536 pts/2    S    08:56   0:00 /bin/sh -c echo 'hello world'; sleep 1000
ben       137374  0.0  0.0   8288  1920 pts/2    S    08:56   0:00 sleep 1000
ben       137384  0.0  0.0   2800  1536 pts/2    S    08:56   0:00 /bin/sh -c echo 'hello world'; sleep 1000
ben       137385  0.0  0.0   8288  1920 pts/2    S    08:56   0:00 sleep 1000

Any idea why? Maybe we can do some logging to see why. I would have thought sending SIGINT should cause Pebble to stop the running services before exiting, but it's clearly not doing that (properly, at any rate).

One thing we could do to mitigate it (but not fix it, so we should still look into it) is changing the sleep 1000 to sleep 10 -- that's still plenty long enough for these tests, but at least they'll exit sooner themselves.

I'll also ask Harry to review this, as he originally created the issue.

hpidcock

Looking good so far. Just a few nitpicks.

tests/main_test.go

tests/run_test.go

benhoyt · 2024-09-20T00:56:28Z

Agreed with all of Harry's comments (but a slightly different explicit approach than --no-build) -- thanks for the review.

IronCore864 · 2024-09-20T02:57:53Z

Reply to the above comments:

I have added some comments for all the tests and most of the helper functions.
A new flag -pebbleBin is added as Ben suggested, so that if set, it will be used for the integration tests rather than building one.
Services not stopped after SIGINT is sent to Pebble is mitigated by changing sleep 1000 to sleep 10, but it's not resolved.
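A minimal sketch of how the -pebbleBin behaviour could be wired up, assuming the flag is defined with the standard flag package and the build step is passed in as a function. resolvePebbleBin and the build callback are hypothetical names, not necessarily what this PR does.

```go
package main

import (
	"flag"
	"fmt"
)

// pebbleBin mirrors the -pebbleBin flag described above; the help text
// is an assumption.
var pebbleBin = flag.String("pebbleBin", "", "path to a prebuilt pebble binary; if empty, build one")

// resolvePebbleBin returns the binary the tests should run: the flag value
// if set, otherwise whatever the build function produces.
// Hypothetical helper for illustration only.
func resolvePebbleBin(flagValue string, build func() (string, error)) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	path, err := build()
	if err != nil {
		return "", fmt.Errorf("cannot build pebble: %w", err)
	}
	return path, nil
}
```

In a TestMain this would be called as resolvePebbleBin(*pebbleBin, buildFn) after flag.Parse(), where buildFn runs go build into a temporary directory.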

benhoyt

Great, thanks!

benhoyt · 2024-09-20T02:59:22Z

> Services not stopped after SIGINT is sent to Pebble is mitigated by changing sleep 1000 to sleep 10, but it's not resolved.

Can you please spend a bit of time looking into why this is? I just don't want to paper over something if it's an actual bug.

benhoyt · 2024-09-22T22:21:39Z

I looked into this further, and debugged by printing out the Pebble/stderr logs (and running the tests with -v so they showed up). Then I enabled PEBBLE_DEBUG=1 in the subprocess, and this helped me see that when SIGTERM was sent, it was before the "start" had completed (which takes 1s). So stopRunningServices thought there were "No services to stop." as they were in "starting" state, and didn't try to send SIGTERM to the children. Arguably stopRunningServices should include the services in "starting" state. I might open a separate issue on that.
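To make the race concrete, here is a toy model of the selection logic. This is not Pebble's actual implementation; the function name, the state strings other than "starting", and the includeStarting switch are assumptions for illustration.

```go
package main

import "sort"

// servicesToStop is a toy model of choosing which services to stop on
// shutdown. With includeStarting false, services still in the "starting"
// state are skipped, which is the behaviour described above.
// Not Pebble's real code.
func servicesToStop(states map[string]string, includeStarting bool) []string {
	var names []string
	for name, state := range states {
		if state == "running" || (includeStarting && state == "starting") {
			names = append(names, name)
		}
	}
	// Sort for a deterministic result, since map iteration order is random.
	sort.Strings(names)
	return names
}
```

If the signal arrives within the first second, both services are still "starting", so the model returns nothing to stop unless includeStarting is true.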

In the meantime, it's probably best if we wait until the "startup: enabled" services are started when the daemon starts up, and the simplest way to do this now is to wait for the "Started default services ..." log. Note that the full log looks like this:

2024-09-22T22:10:40.147Z [pebble] Started default services with change 1.

I recommend the following diff, which cleans up nicely in my tests:

diff --git a/tests/run_test.go b/tests/run_test.go
index c73c365..c98c693 100644
--- a/tests/run_test.go
+++ b/tests/run_test.go
@@ -49,7 +49,8 @@ services:
 
        createLayer(t, pebbleDir, "001-simple-layer.yaml", layerYAML)
 
-       _, _ = pebbleRun(t, pebbleDir)
+       _, stderrCh := pebbleRun(t, pebbleDir)
+       waitForLog(t, stderrCh, "pebble", "Started default services", 3*time.Second)
 
        waitForFile(t, filepath.Join(pebbleDir, "svc1"), 3*time.Second)
        waitForFile(t, filepath.Join(pebbleDir, "svc2"), 3*time.Second)
@@ -141,6 +142,7 @@ services:
        stdoutCh, stderrCh := pebbleRun(t, pebbleDir, "--verbose")
        waitForLog(t, stderrCh, "pebble", "Started daemon", 3*time.Second)
        waitForLog(t, stdoutCh, "svc1", "hello world", 3*time.Second)
+       waitForLog(t, stderrCh, "pebble", "Started default services", 3*time.Second)
 }
 
 // TestArgs tests that Pebble provides additional arguments to a service
@@ -166,6 +168,7 @@ services:
        )
        waitForLog(t, stderrCh, "pebble", "Started daemon", 3*time.Second)
        waitForLog(t, stdoutCh, "svc1", "hello world", 3*time.Second)
+       waitForLog(t, stderrCh, "pebble", "Started default services", 3*time.Second)
 }
 
 // TestIdentities tests that Pebble seeds identities from a file

benhoyt · 2024-09-22T22:28:12Z

I've opened #502 to track having Pebble terminate services in "starting" state as well.

IronCore864 · 2024-09-23T07:09:30Z

While testing the leaked "sleep" issue, I found another place that leaks a sleep. Since it's not related to this feature, I will merge this PR and continue debugging it in another branch.

IronCore864 added 2 commits September 4, 2024 12:46

test: integration test poc

236800d

test: integration test poc

682c2aa

dimaqq reviewed Sep 4, 2024

benhoyt reviewed Sep 10, 2024

IronCore864 added 2 commits September 10, 2024 16:54

chore: refactor after discussion

a45bee7

chore: add some comments

e0bb83b

benhoyt reviewed Sep 11, 2024

IronCore864 added 2 commits September 11, 2024 15:43

chore: refactor after discussion and initial review

574fa49

tests: add integration tests for pebble run

6b5e33f

IronCore864 marked this pull request as ready for review September 12, 2024 14:51

IronCore864 requested a review from benhoyt September 12, 2024 14:51

chore: update integration tests readme

bf21bdb

benhoyt requested changes Sep 15, 2024

benhoyt changed the title ~~test: integration test poc~~ test: add integration tests for "pebble run" Sep 16, 2024

benhoyt reviewed Sep 16, 2024

test: add gha workflow for integration test

e8f37b2

chore: refactor after code review

949a955

benhoyt requested changes Sep 19, 2024

IronCore864 added 2 commits September 19, 2024 17:14

chore: refactor after code review

d54ce7d

chore: refactor after review

e6c29dc

IronCore864 requested a review from benhoyt September 19, 2024 11:11

benhoyt approved these changes Sep 19, 2024

hpidcock reviewed Sep 19, 2024

chore: refactor after review

b206103

chore: add comments for tests

298c839

benhoyt approved these changes Sep 20, 2024

benhoyt mentioned this pull request Sep 22, 2024

stopRunningServices should probably also stop "starting" services #502

Open

chore: fix leaked sleep services

a3ad795

Merge branch 'master' into integration-test-poc

1943b76

IronCore864 merged commit 0ca17af into canonical:master Sep 23, 2024
18 checks passed

IronCore864 deleted the integration-test-poc branch September 23, 2024 07:18

benhoyt mentioned this pull request Sep 24, 2024

Add unit and integration tests for pebble run #351

Closed
